Search | Portal Regional da BVS

Fully Sparse Fusion for 3D Object Detection.

Li, Yingyan; Fan, Lue; Liu, Yang; Huang, Zehao; Chen, Yuntao; Wang, Naiyan; Zhang, Zhaoxiang.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Apr 22.

Article in English | MEDLINE | ID: mdl-38648139

ABSTRACT

Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic in the detection range, which makes them unscalable for long-range detection. Recently, LiDAR-only fully sparse architectures have been gaining attention for their high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates the well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints associated with the LiDAR-only fully sparse detector. Our framework achieves state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, under the long-range perception setting, the inference speed of our proposed method is 2.7× faster than that of other state-of-the-art multi-modal 3D detection methods. Code is released at https://github.com/BraveGroup/FullySparseFusion.
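The quadratic scaling mentioned in the abstract can be illustrated with a short arithmetic sketch. It is not taken from the paper: the function name `dense_bev_cells`, the square grid layout, the 0.5 m cell size, and the example ranges are all assumptions chosen only for illustration.

```python
def dense_bev_cells(half_range_m: float, cell_size_m: float) -> int:
    """Count the cells of a square BEV grid covering [-R, R] on both axes.

    Hypothetical helper: the grid shape and cell size are illustrative
    assumptions, not values reported by the authors.
    """
    cells_per_side = round(2 * half_range_m / cell_size_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    # Doubling the range quadruples the number of cells a dense map must hold.
    for half_range in (50.0, 100.0, 200.0):
        print(half_range, dense_bev_cells(half_range, cell_size_m=0.5))
```

Under these assumptions, every doubling of the range multiplies the dense grid size by four, whereas a sparse representation grows only with the number of occupied locations.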

Super Sparse 3D Object Detection.

Fan, Lue; Yang, Yuxue; Wang, Feng; Wang, Naiyan; Zhang, Zhaoxiang.

IEEE Trans Pattern Anal Mach Intell ; 45(10): 12490-12505, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37318978

ABSTRACT

As the perception range of LiDAR expands, LiDAR-based 3D object detection contributes increasingly to long-range perception in autonomous driving. Mainstream 3D object detectors often build dense feature maps whose cost is quadratic in the perception range, so they scale poorly to long-range settings. To enable efficient long-range detection, we first propose a fully sparse object detector termed FSD. FSD is built upon a general sparse voxel encoder and a novel sparse instance recognition (SIR) module. SIR groups the points into instances and applies highly efficient instance-wise feature extraction. The instance-wise grouping sidesteps the issue of center feature missing, which hinders the design of fully sparse architectures. To further exploit the fully sparse design, we leverage temporal information to remove data redundancy and propose a super sparse detector named FSD++. FSD++ first generates residual points, which indicate the point changes between consecutive frames. The residual points, along with a few previous foreground points, form the super sparse input data, greatly reducing data redundancy and computational overhead. We comprehensively analyze our method on the large-scale Waymo Open Dataset and report state-of-the-art performance. To showcase the superiority of our method in long-range detection, we also conduct experiments on the Argoverse 2 dataset, where the perception range ([Formula: see text] m) is much larger than that of the Waymo Open Dataset ([Formula: see text] m).
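The idea of residual points can be sketched as follows. This is a minimal illustration, not the FSD++ implementation: the abstract says only that residual points indicate point changes between consecutive frames, so the voxel-occupancy comparison, the names `voxel_key` and `residual_points`, and the 0.5 m voxel size are assumptions. The sketch also assumes both frames are already expressed in the same coordinate system.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_key(point: Point, voxel_size: float) -> Voxel:
    """Map a 3D point to the integer index of the voxel that contains it."""
    x, y, z = point
    return (
        math.floor(x / voxel_size),
        math.floor(y / voxel_size),
        math.floor(z / voxel_size),
    )


def residual_points(
    previous: Sequence[Point],
    current: Sequence[Point],
    voxel_size: float = 0.5,  # assumed value, for illustration only
) -> List[Point]:
    """Keep current-frame points that fall in voxels empty in the previous frame."""
    occupied = {voxel_key(p, voxel_size) for p in previous}
    return [p for p in current if voxel_key(p, voxel_size) not in occupied]


if __name__ == "__main__":
    prev_frame = [(0.1, 0.1, 0.1), (5.0, 5.0, 0.0)]
    curr_frame = [(0.2, 0.2, 0.2), (10.0, 0.0, 0.0)]
    # Only the point in a newly occupied voxel survives the filter.
    print(residual_points(prev_frame, curr_frame))
```

In this simplified form, static structure observed in both frames is discarded, and only newly occupied regions are passed on, which is the source of the reduced input size described in the abstract.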
